DOCUMENT RESUME

ED 088 284                                        FL OOa 971

AUTHOR          Zelson, Sidney N. J.
TITLE           Measurement and Evaluation of Speaking Skill in a
                Second Language.
PUB DATE        [72]
NOTE            10p.

EDRS PRICE      MF-$0.75 HC-$1.50
DESCRIPTORS     *Achievement Tests; Audiolingual Skills; *Language
                Instruction; *Language Tests; Listening Tests;
                *Modern Languages; Phonology; *Speech Skills; Student
                Evaluation; Syntax; Test Construction; Test Validity;
                Vocabulary



ABSTRACT 

A theoretical discussion of problems encountered in
the measurement and evaluation of speaking skill in a second language
is developed in this paper. The primary areas of interest to be
evaluated are identified and discussed, including phonology, syntax,
morphology, and vocabulary. A delimited and graduated method of
evaluation using numerical scores is outlined in each of the target
areas. (RL)



ERIC 



U.S. DEPARTMENT OF HEALTH,
EDUCATION & WELFARE
NATIONAL INSTITUTE OF



MEASUREMENT AND EVALUATION OF 
SPEAKING SKILL IN A SECOND LANGUAGE 




Sidney N. J. Zelson
SUCNY at Buffalo 



The testing of second language speaking skills presents a
variety of problems: (1) a certain amount of equipment is often
needed to administer such a test; (2) certain facilities are
required if the examiner wishes to prevent one student from being
influenced by another's answers; (3) whether the examinee's
performance is recorded, then scored, or administered individually
and scored at the same time, a lack of inter-scorer reliability and
intra-scorer reliability may detract from the accuracy of the
measure; (4) some types of activities may go somewhat beyond
testing of speaking skills only: competency in listening and reading
may be a vital factor; (5) frequently, the speaking task calls
for behavior that is as dependent upon imagination and/or
reasoning as it is upon linguistic or communicative competence; (6) we
really do not know whether personality factors influence the
examinee's performance in a speaking test to a greater degree
than on other measures. If the tasks are novel, this problem
may be all the more acute. There might well be two students, for
example, with similar oral competence, one of whom is much more
inhibited by the test situation. (7) The time element involved
in scoring these measures is overwhelming.

Does a frequent formal evaluation alleviate some of these
difficulties? We really don't know, but the idea seems to have
some face validity. Are we ready to spend the amount of time and
effort necessary for such an undertaking? Is it useful to practice
such tasks so that at least they will not be unfamiliar to the
student? It may be a reasonable alternative that can minimize at
least a few of the above-mentioned problems.

A number of decisions should be made. Do we wish to examine
linguistic competence or communicative competence or both? Are
they equally important? Are we interested in giving structured or
unstructured tasks? How important is pronunciation? How important
are other phonological features? Is phonology equal in importance
to syntax, morphology, and vocabulary?

As we consider the testing of such categories as the four
mentioned above, it is useful to keep in mind the unanswered
questions that were previously posed. There are procedures that we
might select that can help us to obtain measures that are less
confounded by factors other than speaking competency. A variety of
techniques may be used to test the four categories simultaneously,
though phonology will be seriously slighted and diagnostically such
techniques will be relatively ineffective. Yet such tasks as those
to be suggested seem much more appropriate in a summative evaluation,
which seems to be our goal in this case.

(1) A student is told that he is to prepare to interview
a visitor to our country, a visitor who speaks no English. The
student is given a list of items of information he must obtain
(supplied to him in English) and is asked to record his questions.[1]




a. Where was she born?
b. When does she have to leave for her country?
c. How long has she been here?
d. Does she have any brothers or sisters?
e. What are their names?
f. Where are her parents?
g. How is she today?
h. Why doesn't she speak English?
i. Where is she going to spend the summer?
j. Where would she like to live?
k. What is her favorite season?
l. What is the weather like in the winter in her
   country?
m. Did the French government (Spanish, German) send
   her?
n. What did she do on Sundays when she was a little
   girl?
o. Would she stay in the United States if she could?
p. What does she want most to see in this country?
r. What did she do first upon arriving in this city?
s. Who is the tallest one in her family?
t. Does she drink wine, milk, coffee or tea with her
   meals?

One may observe that such questions involve a common vocabulary, a
range of syntactic structures that usually differ from their English
counterparts, numerous areas of pronunciation and intonation, and
several tenses, though there are few items that test direct or
indirect object pronouns.

If all the students were recording simultaneously, they could
be given varied sequences in which to ask their questions; perhaps
each student could be instructed to start with a different number.
In that way the scorer could be more confident that each tape
represented the student's own work.[2]
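The idea of starting each student at a different item can be sketched as a simple rotation of one shared list. This is only an illustration under stated assumptions: the function names, the student labels, and the shortened question list are hypothetical and do not come from the paper.

```python
def rotated_order(items, start):
    """Return the items beginning at 0-based position `start`, wrapping around."""
    if not items:
        return []
    k = start % len(items)
    return items[k:] + items[:k]


def assign_orders(students, items):
    """Give every student the same items, each starting at a different position.

    Assumption: students are simply numbered in seating order, so student i
    starts at item i (wrapping if there are more students than items).
    """
    return {student: rotated_order(items, i) for i, student in enumerate(students)}


# Hypothetical, shortened list of question labels for illustration.
questions = ["a", "b", "c", "d", "e"]
orders = assign_orders(["student_1", "student_2", "student_3"], questions)
for student, order in orders.items():
    print(student, order)
```

Because every tape then follows a known, distinct sequence, a response overheard from a neighbor would not match the item the student is supposed to be asking at that moment.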

(2) The student may be instructed to tell a story from a
series of pictorial cues. He is given a few minutes for preparation.

It may be noted that the student's imagination may be a prime
factor in his performance. His score may also be affected by the
degree of complexity with which he expresses himself and the area
that he chooses to develop in his comments. It may happen, also,
that he doesn't know vocabulary in a particular area. For those
reasons it may be advisable to give him three sequences and count
only the best two in his score. Provided the student has not had
previous exposure to the specific context, it may be that the
student's best performance gives the most valid appraisal of his
speaking ability.
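The "three sequences, count the best two" rule amounts to dropping the lowest of three scores. A minimal sketch follows; the function name and the sample scores are hypothetical, and the scale on which each sequence is scored is left open, as it is in the text.

```python
def best_two_total(sequence_scores):
    """Sum the two highest of exactly three picture-sequence scores."""
    if len(sequence_scores) != 3:
        raise ValueError("expected scores for exactly three sequences")
    # Sort from highest to lowest and keep the first two.
    return sum(sorted(sequence_scores, reverse=True)[:2])


# Hypothetical scores for one student on three picture sequences.
print(best_two_total([9, 14, 12]))  # 14 + 12 = 26
```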

(3) The student may be supplied with a number of topics
from which he may choose one to discuss, or he may be given several
situations or roles to play, and asked to give a short monologue.
For each of these tasks he is allowed time for preparation.

(4) The student discusses a picture at some length (with or
without English cues) or makes a relevant statement about each
picture of a series.

(5) The student is given one half of a dialog and asked to
prepare to participate.

This type of exercise calls for reading competence if the
dialog is written in the second language. Furthermore, lack of
familiarity with one lexical item in one utterance may well affect
performance throughout the remainder of the test. In such a test
most or all of the previously mentioned variables can come into play,
either individually or in an interaction.

A common scoring procedure for evaluating such performances
as those suggested involves observing four areas: (1) fluency,
(2) pronunciation, (3) grammar, and (4) vocabulary. Whether or
not they are weighted equally is a point for consideration, though
the literature implies that they are given equal importance. A
system has appeared in recent literature, one that seems quite
adaptable and that defines or delimits each graduation more clearly
than several others.[3]

Pronunciation

  Phonemically accurate pronunciation throughout                4
  Occasional phonemic error, but generally comprehensible       3
  Many phonemic errors: very difficult to perceive meaning      2
  Incomprehensible, or no response                              1

Vocabulary

  Consistent use of appropriate words throughout                4
  Minor lexical problems, but vocabulary generally appropriate  3
  Vocabulary usually inaccurate, except for occasional
    correct word                                                2
  Vocabulary inaccurate throughout, or no response              1

Structure

  No errors of morphology or syntax                             4
  Generally accurate structure, occasional slight error         3
  Errors of basic structure, but some phrases rendered
    correctly                                                   2
  Virtually no correct structures, or no response               1

Fluency

  Speech is natural and continuous. Any pauses correspond to
    those which might have been made in native language
    (original text reads "made by a native speaker").           4
  Speech is generally natural and continuous. Occasional
    slight stumblings or pauses at unnatural points in the
    utterance.                                                  3
  Some definite stumbling, but manages to rephrase and
    continue.                                                   2
  Long pauses, utterances left unfinished, or no response.      1

The writer finds this scale more realistic, particularly for students 

at less advanced levels, than several others that have been outlined. 
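A composite score on the four-category scale might be tallied as sketched below. The equal default weights follow the remark that the literature implies equal importance; the optional `weights` argument, the function name, and the sample ratings are assumptions added for illustration, not part of the published scale.

```python
CATEGORIES = ("pronunciation", "vocabulary", "structure", "fluency")


def composite_score(ratings, weights=None):
    """Combine four 1-4 ratings into one total.

    `ratings` maps each category name to an integer from 1 to 4.
    `weights` (hypothetical) maps each category to a multiplier;
    when omitted, all four categories count equally.
    """
    if weights is None:
        weights = {category: 1.0 for category in CATEGORIES}
    total = 0.0
    for category in CATEGORIES:
        rating = ratings[category]
        if rating not in (1, 2, 3, 4):
            raise ValueError(f"{category}: rating must be 1, 2, 3, or 4")
        total += weights[category] * rating
    return total


# Hypothetical ratings for one recorded performance.
sample = {"pronunciation": 3, "vocabulary": 4, "structure": 2, "fluency": 3}
print(composite_score(sample))  # equal weighting: 3 + 4 + 2 + 3 = 12.0
```

With equal weights the total ranges from 4 to 16; a teacher who decides that structure matters more could pass, for example, a weight of 2.0 for that category alone.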




Speaking vocabulary may be tested separately with techniques
such as response to pictorial cues, as well as with those procedures
already discussed, but it may be more useful and more practicable
to incorporate that category in other tests.

We may wish to evaluate the student's control of phonological
features of the second language:

a. segmental phonemes

b. suprasegmental phonemes: they may be observed
   separately or as intonation

   1. stress
   2. pitch
   3. juncture

c. rhythm and other features

(1) The examinee listens to an utterance and repeats it.
The scorer observes elements that have been predetermined but not
pointed out to the examinee. Usually two, but no more than three
elements may be observed per utterance.

Echo type items test the student's ability to reproduce
appropriate features but not his comprehension of the principles
involved, which is also a requisite for authentic speech, i.e., he
may not perform similarly in a non-repetitive task.

(2) The examinee is given a series of sentences or
expressions to read. Student performance is evaluated as per (1).

It is sometimes argued that orthographic symbols may cause 
the student to make errors where he might not in actual conversation. 




(3) The student may be instructed to answer a simple
question, or to respond to a simple utterance. He may be prompted with
a cue; by this means the response can be structured to a greater
degree. However, this task obviously calls for comprehension also.

(4) The student may be asked to respond to pictorial cues:
name the picture, tell what the person is doing, tell what time it
is, what the person has in his hand, and other tasks of that type.

(5) The student may be asked to give one or more pattern
drill responses. A simple substitution exercise, or another low
level task, will lessen the likelihood of syntactical problems.

The scoring of the production of phonological elements will
likely take one basic form, though slightly different sets of
criteria may be used. If the goal is phonetic accuracy, the items
might be marked pass/fail, native/non-native, authentic/unauthentic,
or with some other similar terminology; another possible treatment
may be a three-position scale: phonetic accuracy, phonemic
accuracy, unacceptable. The choice of one or the other would be
determined by previously-established objectives. The use of
acceptable/unacceptable ratings, without further definition in more precise
terminology, may cause one to question the reliability of such an
appraisal as well as its meaningfulness.
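One way to tally the three-position scale is sketched below. The point values (2, 1, 0) are an assumption made purely for illustration, since the text names the three positions but assigns them no numbers; the function name and the sample elements are likewise hypothetical. The limit of three scored elements per utterance follows the guideline given for the repetition task.

```python
# Assumed point values for the three positions; the paper does not give numbers.
SCALE = {"phonetic": 2, "phonemic": 1, "unacceptable": 0}
MAX_ELEMENTS = 3  # no more than three predetermined elements per utterance


def score_utterance(judgments):
    """Total the scorer's judgments for one utterance.

    `judgments` maps each predetermined element (e.g. a segmental phoneme
    or a pitch contour) to one of the three scale labels.
    """
    if not 1 <= len(judgments) <= MAX_ELEMENTS:
        raise ValueError("score between one and three elements per utterance")
    return sum(SCALE[label] for label in judgments.values())


# Hypothetical utterance with two predetermined elements.
print(score_utterance({"trilled r": "phonetic", "final pitch": "phonemic"}))  # 2 + 1 = 3
```

A pass/fail treatment is the same tally with only two labels, which is why defining each label precisely matters more than the arithmetic.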

A student's control of morphology and syntax in a second
language may be measured in the following ways.

(1) Procedures Three through Five from Phonology section
(see pp. 6-7) may be adapted to the testing of morphology and syntax.


(2) Procedure One from General Speaking Test section (see 
pp. 2-3) may be used effectively. 

(3) The student responds to pattern drills of several
varieties.

It must be noted here that such an exercise may discriminate
against those students who are not accustomed to using pattern drills.
A more complex type of drill would aggravate the problem all the
more.

(4) The student is instructed to "express the following
ideas" (supplied in English). Only predetermined elements are
scored.

(5) The student is supplied with a series of dehydrated
sentences in the second language. Each is given with a model, so as
to structure the response to the extent that the examiner wishes.
The student records his responses.

The scoring of morphemic structures may be done on a correct/
incorrect basis. Syntactic structures may be treated the same way
or one point may be allowed for choice of the correct structure and
one point for all correct forms within that structure.
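The two scoring options for structure can be written out as follows. This is a sketch under one stated assumption: the second point for correct forms is awarded only when the correct structure was chosen, which is one reading of "within that structure"; the function names are hypothetical.

```python
def score_morphemic_item(correct):
    """Correct/incorrect scoring: one point or none."""
    return 1 if correct else 0


def score_syntactic_item(structure_correct, forms_correct):
    """Two-point scoring for a syntactic item.

    One point for choosing the correct structure, and one further point
    when every form within that structure is also correct. Assumption:
    no credit for forms if the structure itself is wrong.
    """
    if not structure_correct:
        return 0
    return 2 if forms_correct else 1


print(score_syntactic_item(True, True))   # right structure, all forms right: 2
print(score_syntactic_item(True, False))  # right structure, a form wrong: 1
```

The two-point version gives partial credit to a student who selects, say, the right tense construction but misinflects a word inside it, which the plain correct/incorrect treatment would score as zero.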

Various techniques of evaluation of the speaking skills and
sub-skills have been set forth and discussed. An effort has been
made to outline some of the limitations, some of the variables that
may detract from the validity of our measures. The primary
consideration in the choice of one procedure or another has been the
extent to which, in the judgment of the writer, speaking skills and
only speaking skills were tested, though several tasks were included
that are not pure tests of speaking ability.


While it is seldom if ever possible to eliminate all
extraneous factors, it is incumbent upon the test constructor to eliminate
or minimize those which he can and to be aware of those which he has
not and can not. Only to that extent can we be confident of the
accuracy of the information we have gathered.






FOOTNOTES 



1. Many of the questions are suggested by those in a dissertation
proposal by Patricia Powell, "An Investigation of Selected
Syntactical and Morphological Structures in the Conversation of High
School Students After Two Years' Study of French." Additional
items of this type may be found in "Directed Composition," Review
Text in French Two Years (New York: Amsco Publications, 19S4),
p. ^§9*

2. The writer has noted, in the scoring of speaking-test tapes, in
installations of several kinds, that it is common to hear responses
other than from the student who is being scored. They often come through
quite clearly, and immediately prior to the response that is to be
evaluated.

3. John L. D. Clark, Foreign Language Testing: Theory and Practice
(Philadelphia: Center for Curriculum Development, 1972), p. 93.



